Considerations for Elasticsearch Dynamic Field Mapping

TL;DR

  • Explicit Mapping is strongly recommended for production environments to ensure performance and storage efficiency.
  • Dynamic Mapping causes string fields to generate dual indexes for text and keyword, leading to wasted storage space.
  • Special features (such as geolocation, nested objects, and custom analyzers) cannot be automatically enabled via dynamic mapping and must be defined manually.
  • Over-reliance on dynamic mapping can trigger a "Mapping Explosion," causing the number of index fields to exceed the limit (default is 1000).
  • It is recommended to set the dynamic parameter to false or strict to prevent the index structure from becoming uncontrollable.

Misconceptions About Dynamic Field Mapping

In Elasticsearch, while Dynamic Mapping provides convenience during the initial development phase, it hides several risks in production environments.

1. String Types Lead to Storage Bloat

When you might encounter this issue: When Elasticsearch automatically infers the type of a string field.

Under dynamic mapping, Elasticsearch indexes a new string field as text with an additional keyword sub-field (field.keyword) by default. text supports full-text search, while keyword supports exact matching, sorting, and aggregations. This dual indexing noticeably increases storage consumption. Unless you need both behaviors, explicitly define the field as only text or only keyword to save space.
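For example, an explicit mapping can declare a field that is only ever filtered or aggregated on as keyword alone, avoiding the redundant text index (the index and field names below are illustrative):

```json
PUT /orders
{
  "mappings": {
    "properties": {
      "status":      { "type": "keyword" },
      "description": { "type": "text" }
    }
  }
}
```

With this mapping, status is indexed once for exact matching and aggregations, and description once for full-text search; neither field carries the duplicate index that dynamic mapping would create.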

2. Special Features Cannot Be Automatically Enabled

When you might encounter this issue: When you need to use geolocation, nested structures, or custom analyzers.

Dynamic mapping can only handle basic types and cannot recognize specific requirements:

  • Geolocation: If not pre-defined as geo_point or geo_shape, the system treats it as a standard object, making it impossible to use geolocation query APIs.
  • Nested Objects: Dynamic mapping maps inner objects to the plain object type, which flattens arrays of objects; fields from different objects in the same array get merged together, so a query can match across unrelated objects. The nested type must be declared explicitly to keep each object independent.
  • Custom Analyzers: Dynamic mapping always uses the default standard analyzer, making it impossible to apply Chinese word segmentation or synonym processing.
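All three features above must be declared when the index is created. A single explicit mapping can cover them, as in this sketch (index, field, and analyzer names are illustrative):

```json
PUT /stores
{
  "settings": {
    "analysis": {
      "filter": {
        "brand_synonyms": {
          "type": "synonym",
          "synonyms": ["es, elasticsearch"]
        }
      },
      "analyzer": {
        "synonym_analyzer": {
          "tokenizer": "standard",
          "filter": ["lowercase", "brand_synonyms"]
        }
      }
    }
  },
  "mappings": {
    "properties": {
      "location": { "type": "geo_point" },
      "products": {
        "type": "nested",
        "properties": {
          "name":  { "type": "text", "analyzer": "synonym_analyzer" },
          "price": { "type": "double" }
        }
      }
    }
  }
}
```

Here location can be used with geo queries, products keeps each object in the array independent, and products.name is analyzed with the custom synonym analyzer — none of which dynamic mapping would have produced.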

3. The Risk of Mapping Explosion

When you might encounter this issue: When the data source contains a large number of non-fixed field names (such as user-defined fields).

Every field added to a mapping enlarges the cluster state, which is held in memory and replicated across nodes, so an index with too many fields drives up memory consumption. Elasticsearch limits each index to 1000 fields by default (index.mapping.total_fields.limit); once the limit is reached, writes that would add new fields are rejected.
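The limit can be raised per index with the index.mapping.total_fields.limit setting, though doing so only postpones the problem; disabling dynamic mapping (see below) is usually the better fix. The index name here is illustrative:

```json
PUT /logs/_settings
{
  "index.mapping.total_fields.limit": 2000
}
```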


Dynamic Mapping Type Inference Rules

The rules by which Elasticsearch automatically infers types based on data content are as follows:

| JSON Data Type | Elasticsearch Type ("dynamic":"true") | Elasticsearch Type ("dynamic":"runtime") |
|---|---|---|
| null | No field added | No field added |
| true or false | boolean | boolean |
| double | float | double |
| long | long | long |
| object | object | No field added |
| array | Depends on the first non-null value in the array | Depends on the first non-null value in the array |
| string passing date detection | date | date |
| string passing numeric detection | float or long | double or long |
| string failing date or numeric detection | text with a .keyword sub-field | keyword |
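The table above can be sketched as a small Python function. This is a simplified simulation of the inference rules, not Elasticsearch's actual implementation — in particular, the date and numeric detection regexes here are far cruder than the real detectors:

```python
import re

def infer_dynamic_type(value, mode="true"):
    """Approximate the dynamic type inference table (mode: "true" or "runtime")."""
    if value is None:
        return None  # no field added
    if isinstance(value, bool):  # must check before int: bool is a subclass of int
        return "boolean"
    if isinstance(value, float):
        return "float" if mode == "true" else "double"
    if isinstance(value, int):
        return "long"
    if isinstance(value, dict):
        return "object" if mode == "true" else None  # runtime: no field added
    if isinstance(value, list):
        # type depends on the first non-null element
        first = next((v for v in value if v is not None), None)
        return infer_dynamic_type(first, mode)
    if isinstance(value, str):
        if re.fullmatch(r"\d{4}-\d{2}-\d{2}(T.*)?", value):  # crude date detection
            return "date"
        if re.fullmatch(r"-?\d+", value):                    # crude integer detection
            return "long"
        if re.fullmatch(r"-?\d+\.\d+", value):               # crude decimal detection
            return "float" if mode == "true" else "double"
        return "text + .keyword" if mode == "true" else "keyword"

print(infer_dynamic_type(3.14))          # float
print(infer_dynamic_type("2025-10-04"))  # date
print(infer_dynamic_type("hello"))       # text + .keyword
```

Note the asymmetry the table encodes: under "runtime", floating-point values widen to double, plain strings become keyword only, and objects add no field at all.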

Dynamic Parameter Configuration Options

To control the index structure, it is recommended to adjust the dynamic parameter based on your scenario:

  • true (default): New fields are automatically added to the mapping. Suitable for the development phase; not recommended for production environments.
  • runtime: New fields exist as runtime fields; they are not indexed and are calculated on the fly during queries. Suitable for fields that are not frequently queried, saving storage space but resulting in poorer query performance.
  • false: Ignores new fields. Data will still appear in _source, but it cannot be searched or indexed. This effectively prevents Mapping Explosion.
  • strict: Throws an exception and rejects writes immediately when a new field is detected. This is the strictest control method, suitable for production environments with high structural requirements.
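The dynamic parameter is set at the mapping level (it can also be set per object field). A minimal strict index might look like this (index and field names are illustrative):

```json
PUT /events
{
  "mappings": {
    "dynamic": "strict",
    "properties": {
      "timestamp": { "type": "date" },
      "message":   { "type": "text" }
    }
  }
}
```

With this configuration, indexing a document that contains any field other than timestamp or message fails with a strict_dynamic_mapping_exception, so schema drift is caught at write time rather than discovered later.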

Conclusion

While dynamic mapping is convenient, it is recommended to plan your Schema in advance and use Explicit Mapping in production environments. This ensures an optimal balance between storage space, query performance, and functional requirements, while avoiding the high costs of reindexing necessitated by structural changes later on.


Change Log

  • 2025-10-04 Initial document creation.